Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Uma Kumari, Monu Malik, Devendra Kumar
DOI Link: https://doi.org/10.22214/ijraset.2023.51486
According to the US National Institutes of Health, SARS-CoV-2 is the successor to SARS-CoV-1, the virus that caused the SARS outbreak of 2002-2004. SARS-CoV-2 is a virus of the species severe acute respiratory syndrome-related coronavirus. Genomic analysis of 566 SARS-CoV-2 virus populations was performed to identify mutations as substitutions, deletions, insertions and single nucleotide polymorphisms (SNPs). Clustal, Clustal Omega and MAFFT were used to align the 566 Indian SARS-CoV-2 sequences. SARS-CoV-2 is an enveloped virus consisting of a positive-sense, single-stranded RNA genome of approximately 30 kb. Each direction of the RNA has three possible reading frames, so every RNA sequence has six possible reading frames in total (shown as six horizontal bars): +1, +2 and +3 on the forward strand and -1, -2 and -3 on the reverse strand. In the BLAST graphic summary for the human coronavirus NL63 genome, a red bar marks the most similar sequences. The bit score is another statistical indicator used in addition to the E-value in BLAST output when comparing sequence similarity searches. Pairwise and multiple sequence alignment in Clustal Omega arranges protein or DNA sequences so that regions of similarity can be identified.
I. INTRODUCTION
Baricitinib is an oral drug that appears to act against COVID-19 by reducing inflammation and through antiviral activity. The FDA stated that for patients who are hospitalized due to COVID-19 and require mechanical ventilation or supplemental oxygen, baricitinib can be used in combination with remdesivir. Several monoclonal antibody drugs are also available. These include the combination of bamlanivimab and etesevimab, the combination of casirivimab and imdevimab, and sotrovimab. These drugs are used to treat COVID-19 in people who are at a higher risk of serious illness from COVID-19. Many COVID-19 patients have mild illness and can be treated with supportive care.
However, the vaccine development process is time-consuming and requires analysis of the genetic variability of virus populations in order to develop effective and safe vaccines for heterogeneous populations [7]. In this regard, Tung Phan performed a genomic analysis to show the evolution of SARS-CoV-2 [8]. To continue this work, we performed genomic analysis of 566 SARS-CoV-2 virus populations to identify mutations as substitutions, deletions, insertions and single nucleotide polymorphisms (SNPs). Generally speaking, a substitution that affects at least 1% of the population is referred to as an SNP. We studied only non-synonymous mutations because they are responsible for amino acid changes. To find genetic variation, it is important to have multiple sequence alignments against reference sequences. However, it is well known that multiple sequence alignment techniques provide near-optimal rather than exact results, so for the same sequence pool, different alignment techniques can produce different results. We therefore used four well-known multiple sequence alignment techniques, including Clustal W [9], Clustal O [10] and MAFFT [11], to align the 566 Indian SARS-CoV-2 sequences. These alignment results are then used to identify the mutation list as substitutions, deletions, insertions and SNPs. A consensus of these results, called Consensus Multiple Sequence Alignment (CMSA), is then created to obtain the final mutation list, so that the benefits of all four alignment techniques are retained. It should be noted that identifying SNPs helps to classify virus strains, so that vaccine design and vaccine dose definition can be carried out effectively [12].

Recently, metagenomic analysis using Next-Generation Sequencing (NGS) has shown that SARS-CoV-2 is a single-stranded, enveloped RNA virus with a genome length of 27 to 32 kilobases [13].
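The mutation-listing and consensus steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the tuple layout and the toy alignment rows are hypothetical, and the consensus rule (keep a mutation only if enough aligners report it, unanimous by default) is an assumption, since the text does not state how the CMSA consensus is formed.

```python
from collections import Counter


def call_mutations(ref_row, qry_row):
    """List differences between two rows of one alignment ('-' marks a gap).

    Each call is (kind, ref_pos, offset, ref_base, qry_base), where ref_pos is
    the 1-based position on the ungapped reference and offset numbers the
    bases of an insertion that follows ref_pos (0 for other kinds).
    """
    if len(ref_row) != len(qry_row):
        raise ValueError("aligned rows must have the same length")
    calls = set()
    ref_pos = 0
    offset = 0
    for r, q in zip(ref_row.upper(), qry_row.upper()):
        if r != "-":
            ref_pos += 1
            offset = 0
        if r == q:
            continue  # a match, or a column that is a gap in both rows
        if r == "-":
            offset += 1
            calls.add(("insertion", ref_pos, offset, "-", q))
        elif q == "-":
            calls.add(("deletion", ref_pos, 0, r, "-"))
        else:
            calls.add(("substitution", ref_pos, 0, r, q))
    return calls


def consensus_mutations(calls_per_aligner, min_support=None):
    """Keep calls reported by at least min_support aligners (default: all)."""
    if min_support is None:
        min_support = len(calls_per_aligner)
    counts = Counter(m for calls in calls_per_aligner for m in set(calls))
    return sorted(m for m, n in counts.items() if n >= min_support)


# Toy rows: the same query aligned to the same reference by two aligners.
aligner_a = call_mutations("ATG-CGTA", "ATGACG-A")
aligner_b = call_mutations("ATG-CGTA", "ATGACGA-")
agreed = consensus_mutations([aligner_a, aligner_b])
print(agreed)
```

In this toy case the two aligners place the gap near the 3' end differently, so only the insertion after reference position 3 survives the unanimous consensus. Lowering `min_support` trades specificity for sensitivity.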
According to NCBI reports, SARS-CoV-2 has 11 coding regions, which encode the ORF1ab polyprotein, spike (S) glycoprotein, envelope (E) protein, membrane (M) glycoprotein, nucleocapsid (N) protein, and the accessory proteins ORF3a, ORF6, ORF7a, ORF7b, ORF8 and ORF10. According to further reports, the open reading frames (ORFs) can encode several non-structural proteins (NSPs). The genomic orientation of the SARS-CoV-2 virus is shown in the figure, and the coordinates of its coding regions are given in the supplementary material. It is worth mentioning that the virus is a new strain, and the understanding of its genetic variability in different countries is still limited. This is another motivation for conducting this study on the Indian SARS-CoV-2 sequences.

In December 2019, the outbreak of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) caused severe pneumonia [14]. Since then, it has spread from Wuhan, China to Asia, Europe and the United States, becoming a global pandemic [15]. The first severe cases were linked to the Huanan Seafood Wholesale Market in China, where human pneumonia was confirmed to be caused by infection with a novel coronavirus, which was named SARS-CoV-2 by the International Committee on Taxonomy of Viruses. Current reports show that single nucleotide variants are found in many patients with SARS-CoV-2, which belongs to the genus Betacoronavirus. SARS-CoV-2 contains a functional genomic ribonucleic acid (RNA) that is translated into structural proteins: the transmembrane spike (S) glycoprotein, which uses the host cell angiotensin-converting enzyme 2 (ACE2) to mediate virus entry into the host cell; the nucleocapsid (N) protein, which packages the viral RNA genome; and the envelope (E) and membrane (M) proteins, which along with the spike protein form the viral envelope [16]. The non-structural part of the RNA genome comprises ORF1ab, ORF3, ORF6, ORF7a, ORF8 and ORF10; ORF1ab contains highly conserved information for genomic RNA synthesis and replication, whereas the functions of the other ORF proteins remain unclear or unverified [17].
The propagation (transmission) mechanism begins when SARS-CoV-2 binds to the host cell membrane receptor and then induces membrane endocytosis into the host cell. ORF1 of the viral genome directs its replication and the synthesis of subgenomic RNA. At the same time, N protein and new genomic RNA assemble to form helical nucleocapsids, with M protein inserted in the endoplasmic reticulum (ER) and anchored in the Golgi of the host cell [18]. The E and M proteins then start to trigger the budding process. S, together with the helical N on the membrane-bound ER, triggers the translation of the required viral structural proteins and their transport to the Golgi apparatus. In the final step, virus particles are released through exocytosis, completing the life cycle and replication of the virus.
SARS-CoV-2 is an enveloped virus consisting of a positive-sense, single-stranded RNA genome of approximately 30 kb. Two overlapping ORFs, ORF1a and ORF1b, are translated from the positive-strand genomic RNA to produce continuous polypeptides that are cleaved into a total of 16 non-structural proteins (NSPs). The translation of ORF1b is mediated by a -1 frameshift, which allows translation to continue beyond the stop codon of ORF1a. A negative-strand RNA intermediate is made from the viral genome and is used as a template for the synthesis of genomic positive-strand RNA and subgenomic RNA [19]. Subgenomic RNAs contain a common 5' leader sequence fused to different segments of the 3' end of the viral genome, as well as a 5' cap structure and a 3' poly(A) tail [20]. These distinct fusions occur during negative-strand synthesis at core sequences called transcription-regulating sequences (TRSs), which are present at the 3' end of the leader sequence and preceding each viral ORF. The various subgenomic RNAs encode four conserved structural proteins, spike (S), envelope (E), membrane (M) and nucleocapsid (N), as well as several accessory proteins. On the basis of sequence similarity to other betacoronaviruses, and specifically to SARS-CoV, the present annotation of SARS-CoV-2 includes predictions of six accessory proteins (3a, 6, 7a, 7b, 8 and 10; NC_045512). Increased coverage was also observed at the 5' untranslated region (UTR), which reflects the presence of the 5' leader sequence in all subgenomic RNAs and in the genomic RNA. The decrease in footprint density between ORF1a and ORF1b reflects the proportion of ribosomes that terminate at the ORF1a stop codon instead of frameshifting into ORF1b.
By dividing the footprint density in ORF1b by the density in ORF1a, we can estimate the frameshift efficiency to be 57% ± 12%. This value is similar to the frameshift efficiency of mouse hepatitis virus (MHV) measured using Ribo-seq (48–75%). Similar to the observations for MHV and avian infectious bronchitis virus (IBV), we did not observe any obvious pausing of ribosomes before or at the frameshift site, but we did identify several potential pause sites within ORF1a and ORF1b. Except for ORF1a and ORF1b, all other canonical viral ORFs are translated from subgenomic RNAs. Since the raw RNA-seq density represents the cumulative sum of genomic and subgenomic RNAs, we used two methods to calculate transcript abundance: (i) deconvolution of RNA densities, in which the RNA expression of each ORF is calculated by subtracting the cumulative RNA read density upstream of the ORF region; and (ii) the relative abundance of RNA reads spanning the leader-body junction of each canonical subgenomic RNA. For most ORFs there is a high correlation between these two methods, and in both the N transcript is the most abundant, consistent with other studies [21]. We next compared footprint densities to RNA abundance. For most viral ORFs, transcript abundance correlates almost perfectly with footprint density, which indicates that these viral ORFs are translated with similar efficiency (perhaps owing to their almost identical 5' UTRs); however, three ORFs are outliers. The translation efficiency of ORF1a and ORF1b is significantly lower. This may be due to the different characteristics of their 5' UTR or to an underestimation of their true translation efficiency, because some full-length RNA molecules serve as templates for replication or packaging and are therefore not part of the translated mRNA pool.
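The frameshift efficiency estimate at the start of this passage is a simple ratio of length-normalized footprint densities. The Python sketch below shows the arithmetic only. The function names are hypothetical and the read counts and region lengths are made-up placeholders chosen to give a round number; they are not data from this study.

```python
def read_density(read_count, region_length_nt):
    """Footprint density: reads per nucleotide of the region."""
    if region_length_nt <= 0:
        raise ValueError("region length must be positive")
    return read_count / region_length_nt


def frameshift_efficiency(orf1a_reads, orf1a_len, orf1b_reads, orf1b_len):
    """Ratio of ORF1b density to ORF1a density.

    Ribosomes that fail to frameshift stop at the ORF1a stop codon, so the
    density downstream (ORF1b) relative to upstream (ORF1a) approximates the
    fraction of ribosomes that shifted into the -1 frame.
    """
    upstream = read_density(orf1a_reads, orf1a_len)
    downstream = read_density(orf1b_reads, orf1b_len)
    if upstream == 0:
        raise ValueError("no footprints in ORF1a; ratio is undefined")
    return downstream / upstream


# Placeholder counts and lengths (illustrative only, not measured values).
example = frameshift_efficiency(2000, 1000, 1140, 1000)
print(f"estimated frameshift efficiency: {example:.0%}")
```

In practice the ratio would be computed per replicate and summarized as a mean with its spread, which is how a figure such as 57% ± 12% arises.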
The third outlier is ORF7b, for which we identified very few body-leader junctions; however, it shows relatively high translation, probably due to ribosome leaky scanning of the ORF7a transcript, as was suggested for SARS-CoV. Many transcripts derived from non-canonical junctions have been identified for SARS-CoV-2. These junctions include fusions of the leader to a 3' fragment at an unexpected position in the middle of an ORF (leader-dependent non-canonical junctions) and fusions between sequences that have no similarity to the leader (leader-independent junctions). Coronavirus genomes encode five major open reading frames (ORFs), including a 5' frameshifted polyprotein (ORF1a/ORF1ab) and four canonical 3' structural proteins, namely the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins, which are common to all coronaviruses [22].
II. METHODS AND MATERIAL
Computational analysis of sequence alignment applies computer programming to bioinformatics and data management. NCBI focuses on theoretical, analytical and applied computational approaches and hosts widely used primary databases. The standard protein BLAST (BLASTP) program searches a protein database using a protein query. A database is usually managed by a database management system; together, the data, the database management system and the applications associated with them are referred to as a database system. ORF Predictor facilitates the annotation of expressed sequence tag (EST)-derived sequences, particularly for large-scale EST projects. This tool finds open reading frames, converts them into the corresponding amino acid sequences in single-letter code, and reports their locations in the sequence. Pairwise global alignment between sequences makes it convenient to discover the different mutations involved, including single nucleotide polymorphisms. ORF Investigator is written in a portable programming language and is therefore available to users of all common operating systems. ORF Finder identifies all open reading frames using the standard genetic code. The deduced amino acid sequences can be saved in many formats and searched against sequence databases using the Basic Local Alignment Search Tool (BLAST) server.

The National Center for Biotechnology Information (NCBI) is a branch of the United States National Library of Medicine. It is approved and funded by the government of the United States. NCBI is situated in Bethesda, Maryland and was established in 1988. It serves as an international resource for the scientific research community, providing access to public databases and software tools for analyzing biological data, as well as performing research in computational biology.
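The ORF search described above (scan all six reading frames, translate with the standard genetic code, report locations) can be sketched in plain Python. This is a simplified illustration and not the algorithm used by ORF Finder, ORF Predictor or ORF Investigator: it assumes ORFs run from an ATG to the next in-frame stop codon, and the function names, the `min_codons` parameter and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end, protein) for ATG..stop ORFs in six frames.

    Frames are labelled +1, +2, +3 (forward) and -1, -2, -3 (reverse).
    start and end are 1-based and inclusive on the strand being read, and
    end includes the stop codon. RNA input is accepted (U is read as T).
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif CODON_TABLE.get(codon, "X") == "*":
                    if (i - start) // 3 >= min_codons:
                        protein = "".join(
                            CODON_TABLE.get(s[j:j + 3], "X")
                            for j in range(start, i, 3)
                        )
                        orfs.append((f"{strand}{offset + 1}", start + 1, i + 3, protein))
                    start = None
    return orfs


# Toy sequence with one short ORF (ATG GCC TAA) in forward frame 3.
toy_orfs = find_orfs("AAATGGCCTAAGG")
print(toy_orfs)
```

Real tools add options this sketch omits, such as alternative start codons, alternative genetic codes and ORFs that run off the end of the sequence.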
The NCBI is made up of multidisciplinary research and development teams, including molecular biologists, biochemists and clinicians, whose tasks are to:
• Assemble scientific and medical research data from around the globe
• Serve as the immense repository of the world’s primary biological research data
• Produce curated datasets to enhance the value and usability of the primary data.

Entrez: The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search various individual health sciences databases. NCBI distributed the first version of Entrez in 1991, composed of nucleotide sequences from PDB and GenBank and protein sequences from SWISS-PROT, translated GenBank, PIR, PRF and PDB.
III. RESULTS AND DISCUSSION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus that causes COVID-19 (coronavirus disease 2019), the respiratory illness responsible for the pandemic. The genome sequence and its organization were analyzed at the molecular level by applying the bioinformatics tools NCBI, BLAST, FASTA, Clustal Omega and ORF prediction.
The “blastn” program is a general-purpose nucleotide search and alignment program that is sensitive and can be used to align rRNA or tRNA sequences as well as mRNA or genomic DNA sequences containing a mix of coding and non-coding regions. BLAST, the Basic Local Alignment Search Tool, is an algorithm for comparing primary biological sequence information, such as betacoronavirus protein sequences. If the E-value threshold is increased from its default value, larger lists containing more low-scoring hits are reported, ranked by alignment quality. Human coronavirus NL63 (HCoV-NL63) is a coronavirus, specifically from the genus Alphacoronavirus. It is an enveloped, positive-sense, single-stranded RNA virus that enters its host cell by binding to ACE2. Modern technology has enabled a powerful and rapid research response to this virus, of a kind never seen in past virus outbreaks. The free online databases allow students to access cutting-edge genome data for SARS-CoV-2, the virus that causes COVID-19.
>MK342133.1 Human coronavirus NL63 strain ChinaGD14 spike gene, partial cds
The computational analysis of SARS-CoV-2 describes the specificity of protein structure, function, phylogeny and interaction at both the molecular and sequence levels. The BLAST algorithm compares the database sequences with the query protein. The BLAST E-value (expect value) is a parameter that specifies the number of hits that can be expected by chance when searching a database of a given size. As the matching score increases, the E-value decreases exponentially. There are three possible reading frames in each direction of the RNA. Data mining by genome alignment analysis helps to detect mutations that pose a severe, potentially deadly threat to humans. It was observed that the SARS-CoV-2 RBD displays a remarkably higher binding affinity for the ACE2 receptor. The overall results of the database analysis will play a more important role in future investigations of the coronavirus.
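The exponential relationship between score and E-value mentioned above follows from the Karlin-Altschul statistics that BLAST reports: a raw score S is converted to a bit score S' = (λS - ln K) / ln 2, and the E-value is E = m·n·2^(-S'), where m and n are the query and database lengths. The Python sketch below works through that arithmetic. The function names are hypothetical, and the λ and K values are illustrative placeholders; real values depend on the scoring system and are printed in each BLAST report.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score to bits: S' = (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative parameters only; take lam and K from the actual BLAST output.
LAM, K = 0.267, 0.041
QUERY_LEN, DB_LEN = 1000, 10**6

for raw in (60, 80, 100):
    bits = bit_score(raw, LAM, K)
    print(f"raw={raw:3d}  bits={bits:6.1f}  E={e_value(bits, QUERY_LEN, DB_LEN):.2e}")
```

Each additional bit halves the E-value, so a 10-bit gain makes a chance match 1024 times less likely for the same search space. This is why the bit score is comparable across searches while the E-value also depends on database size.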
[1] Sanche S, Lin YT, Xu C, Romero-Severson E, Hengartner N, Ke R, “High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2”, Emerg Infect Dis, 26(7):1470-1477, 2020.
[2] Gutierrez B, Márquez S, Prado-Vivar B, et al., “Genomic epidemiology of SARS-CoV-2 transmission lineages in Ecuador”, Virus Evolution, 7(2), 2021.
[3] Zhao J, Yuan Q, Wang H, et al., “Antibody Responses to SARS-CoV-2 in Patients With Novel Coronavirus Disease 2019”, Clin Infect Dis, 71(16):2027-2034, 2020.
[4] Hoskins SG, Stevens LM, Nehm RH, “Selective use of the primary literature transforms the classroom into a virtual laboratory”, Genetics, 176(3):1381-1389, 2007.
[5] Wiertelak EP, Frenzel KE, Roesch LA, “Case Studies and Neuroscience Education: Tools for Effective Teaching”, J Undergrad Neurosci Educ, 14(2):E13-E14, 2016.
[6] Smith AC, Thomas E, Snoswell CL, et al., “Telehealth for global emergencies: Implications for coronavirus disease 2019 (COVID-19)”, J Telemed Telecare, 26(5):309-313, 2020.
[7] Poland GA, “Tortoises, hares, and vaccines: a cautionary note for SARS-CoV-2 vaccine development”, Vaccine, 38:4219-4220, 2020.
[8] Phan T, “Genetic diversity and evolution of SARS-CoV-2”, Infect Genet Evol, 81:104260, 2020.
[9] Thompson JD, Higgins DG, Gibson TJ, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice”, Nucleic Acids Res, 22(22):4673-4680, 1994.
[10] Sievers F, Wilm A, Dineen D, et al., “Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega”, Mol Syst Biol, 7:539, 2011.
[11] Katoh K, Rozewicki J, Yamada KD, “MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization”, Brief Bioinform, 20(4):1160-1166, 2019.
[12] Jeon JS, Won YH, Kim IK, et al., “Analysis of single nucleotide polymorphism among Varicella-Zoster Virus and identification of vaccine-specific sites”, Virology, 496:277-286, 2016.
[13] Vellingiri B, Jayaramayya K, Iyer M, et al., “COVID-19: A promising cure for the global panic”, Sci Total Environ, 725:138277, 2020.
[14] Fehr AR, Channappanavar R, Perlman S, “Middle East Respiratory Syndrome: Emergence of a Pathogenic Human Coronavirus”, Annu Rev Med, 68:387-399, 2017.
[15] De Wit E, van Doremalen N, Falzarano D, Munster VJ, “SARS and MERS: recent insights into emerging coronaviruses”, Nat Rev Microbiol, 14(8):523-534, 2016.
[16] Marra MA, Jones SJ, Astell CR, et al., “The Genome sequence of the SARS-associated coronavirus”, Science, 300(5624):1399-1404, 2003.
[17] Rota PA, Oberste MS, Monroe SS, et al., “Characterization of a novel coronavirus associated with severe acute respiratory syndrome”, Science, 300(5624):1394-1399, 2003.
[18] Andersen KG, Rambaut A, Lipkin WI, Holmes EC, Garry RF, “The proximal origin of SARS-CoV-2”, Nat Med, 26(4):450-452, 2020.
[19] Sola I, Almazán F, Zúñiga S, Enjuanes L, “Continuous and Discontinuous RNA Synthesis in Coronaviruses”, Annu Rev Virol, 2(1):265-288, 2015.
[20] Lai MM, Stohlman SA, “Comparative analysis of RNA genomes of mouse hepatitis viruses”, J Virol, 38(2):661-670, 1981.
[21] Davidson AD, Williamson MK, Lewis S, et al., “Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein”, Genome Med, 12(1):68, 2020.
[22] Stuti, Uma Kumari, “Genome Sequence Analysis of Beta Coronavirus by Applying Bioinformatics Tools”, BJBS, 1(1): 49-54, 2021.
[23] Kumari U, Choudhar AK, “Genome Sequence Analysis of Solanum Lycopersicum by Applying Sequence Alignment Method to Determine the Statistical Significance of an Alignment”, International Journal of Bio-Technology and Research (IJBTR), 2249-6858;9-12, 2016.
[24] Vinita Kukreja, Uma Kumari, “Genome Annotation of Brain Cancer and Structure Analysis by applying Drug Designing Technique”, International Journal of Emerging Technologies and Innovative Research, 9(5);473-k479, May, 2022.
Copyright © 2023 Uma Kumari, Monu Malik, Devendra Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET51486
Publish Date : 2023-05-03
ISSN : 2321-9653
Publisher Name : IJRASET